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ABSTRACT 

While recent supernova cosmology research has benefited from improved measurements, current 
analysis approaches are not statistically optimal and will prove insufficient for future surveys. This 
paper discusses the limitations of current supernova cosmological analyses in treating outliers, se¬ 
lection effects, shape- and color-standardization relations, unexplained dispersion, and heterogeneous 
observations. We present a new Bayesian framework, called UNITY (Unified Nonlinear Inference 
for Type-la cosmologY), that incorporates significant improvements in our ability to confront these 
effects. We apply the framework to real supernova observations and demonstrate smaller statistical 
and systematic uncertainties. We verify earlier results that SNe la require nonlinear shape and color 
standardizations, but we now include these nonlinear relations in a statistically well-justified way. 

This analysis was primarily performed blinded, in that the basic framework was first validated on 
simulated data before transitioning to real data. We also discuss possible extensions of the method. 

Subject headings: cosmology: dark energy, methods: statistical, supernovae: general 


1. INTRODUCTION 

Recent supernova (SN) cosmological measurements 
have greatly reduced both the statistical and systematic 
uncertainty in our knowledge of the accelerated expan- 
sio n of the universe (th e latest such ef f orts a r e presented 
in iSuzuki et al.l l2012t iBetoule et al.l 12014 iRest et ^ 
120141) . Despite these improvements, the frameworks cur¬ 
rently in use are not statistically optimal. As we build 
larger supernova samples, these frameworks will become 
increasingly inadequate. 

This paper offers an improved technique for deriving 
constraints on cosmological parameters from SN mea¬ 
surements (peak magnitude, light-curve shape and color, 
host-galaxy mass). Although this paper uses SN cos¬ 
mology as an example, researchers from other fields may 
find this type of framework useful. Our work is particu¬ 
larly relevant for researchers confronting partially known 
uncertainties, selection effects, correlated measurements, 
and outliers. The rest of this introduction summarizes 
the problems with existing methods and describes in gen¬ 
eral terms the basic requirements for greater accuracy. 
Section[2]describes the framework in detail, and Section[3] 
quantitatively examines the performance of the method 
on simulated data. In Section |4l we demonstrate the 
performance on real SN observations, then conclude in 
Section [5] with future directions. 

^ Department of Physics, Florida State University, Tallahas¬ 
see, FL, 32306 

^ E.O. Lawrence Berkeley National Lab, 1 Cyclotron Rd., 
Berkeley, CA, 94720 

® Department of Physics, University of California Berkeley, 
Berkeley, CA 94720 

Space Telescope Science Institute, 3700 San Martin Drive, 
Baltimore, MD 21218 

® Australian Astronomical Observatory, PO Box 296, Epping, 
NSW 1710, Australia 

® Institut fur Physik, Newtonstr. 15, 12489 Berlin, Humboldt- 
Universitat zu Berlin 


I.l. The Current Approach 

Current cosmological constraints are derived from time 
series of photometric measurements in multiple bands 
(light curves) and spectroscopy. Before obtaining SN 
distances, t he light curves must be fit with a model suc h 
as SALT2 (ICuv et al.l[2?M Imnl: IMosher et alTMl . 
SALT2 models each photometric observation in the ob¬ 
server frame with a combination of a mean SN spec¬ 
tral energy distribution (SED) (scaled by a normaliza¬ 
tion parameter), the first component of the SED vari¬ 
ation (scaled by a shape parameter xi), and the mean 
color variation in magnitudes (which is scaled by a color 
parameter c). The template is also shifted in time to 
match the observations with a date-of-maximum parame¬ 
ter. (In this work, we restrict our attention to the SALT2 
empirical model, which is the best va l idated and most 
widely used such rn odel.) iGuv et ^ (|2007l 120101) and 
IMosher et al.l (I2014D trained the SALT2 model in an ini¬ 
tial (separate) step before the light-curve fitting, using a 
dataset with well-measured spectra and light curves that 
partially overlaps with the SN data used for cosmological 
distances^ After measuring a light-curve shape param¬ 
eter (x°(’®), a light-curve color parameter (c°'^®), and a 
light-curve normalization (m^/, the rest-frame R-band 
flux) 3 we construct a distance modulus estimate as 

+ Mb . ( 1 ) 

a and j3 are the light-curve shape standardization co- 

^ We could, in principle, incorporate the training of SALT2 and 
the fitting of the light curves into our proposed model, but the 
necessary modeling of selection effects would be difficult. 

® Of course, the real observables in astronomy are electrons per 
pixel for each image. We assume here that the photometry, cali¬ 
bration, and light-curve fitting yield an approximate multivariate 
Gaussian likelihood for each SN. 
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efficient and color standardization coefficient, respec¬ 
tively. They quantify the empirical relations that SNe 
with broader rest-frame optical light curves or bl uer rest- 
frame colors are intrinsically more luminous (jPhilliDsI 
Fi^ riTh^Fi^ . S captures residual luminosity cor¬ 
relations with host-galaxy mass, discussed more in Sec- 
tion l2.2l Mb is the absolute magnitude in the rest-frame 
B-band (for a given Hq). All of these coefficients are 
nuisance parameters for cosmological purposes. 

We can then construct a to use for cosmological fit¬ 
ting (for illustrative purposes only, we assume here that 
all of the SN observations are uncorrelated): 

2 ^ cosmo))^ 

^ i (a, /?) + {z) 

^ Mb 

c^obs (a, /3) captures the measurement uncertainties from 
Mb 

the light-curve fit. SALT2 handles fc-corrections implic¬ 
itly, as it is a rest-frame spectral model that fits photom¬ 
etry in the observer frame. Limitations of the SALT2 
model are important to add, such as its inability to si¬ 
multaneously model intrinsic color variation and extinc¬ 
tion with one c parameter (discussed more below). While 
SALT2 does include a simple estimate of the dispersion 
around its model, including this dispersion does not give 
a PSP degree of freedom of 1 when performing cos- 
2 

mological fits. (a, /?) is a term that captures this 

sample-dependent unexplained variance in the SN distri¬ 
bution (sometimes called “intrinsic dispersion”). Finally, 
2 

cr®’'* (z) captures dispersion due to gravitational leasing 
and incoherent peculiar v elocities. In current c osmolog- 
ical analyses (those since iKowalski et al.ll200§l . an iter¬ 
ation is performed between estimating the standardiza- 

2 

tion coefficients, estimating i7®™p (a,/?), and rejecting 
outlying SNe0 

1.2. Limitations of Current Work 

There are several ways in which SN datasets are imper¬ 
fect. Outliers, selection effects, nonlinear correlations, 
partially known uncertainties, and heterogeneity each 
complicate analyses. The proposed framework will ad¬ 
dress each of these complications, but first we discuss 
the limitations of current work. 

Obtaining either spectra or very high-quality light 
curves are the only ways to ensure a transient source 
is a SN la. Even at moderate redshifts, both of these 
techniques are observationally expensive, and non-la SNe 
will inevitably contaminate the sample. A similar issue 
arises if SNe are of type la, but are peculiar, or the red- 
shift is incorrect. The analysis should thus accommo¬ 
date some amount of non-la SNe, which have dissimi- 

® There is a still some amount of confusion in the literature con¬ 
cerning Equation [2] and we address it now to contrast against the 
proposed framework for, e.g., fitting nonlinear {xi, c} standard¬ 
izations. Finding the a and /3 values that minimize this is not 
the same as minimizing the dispersion of the Hubble diagram. Nor 
does it eliminate the correlation between {xi , c} and Hubble dia¬ 
gram residuals. The residuals at the best-fit a and 0 are expected 
to show a correlation with the observed {xi, c}, as uncertainties 
will preferentially scatter the observed values of {xi, c} away from 
the distribution means and it is only the values without this scatter 
that should be uncorrelated with Hubble diagram residuals. 


lar colors, decline rates, and absolute magnitudes. The 
iterative outlier fit described above converges well for 
the sorts of contaminating distrib utions we expect whe n 
the samples are relatively pure (|Kowalski et al.l [200811 . 
When the samples are large (several hundred), or impure 
(< 85%)—conditions the field is beginning to face—the 
outlier s can dominate the other sources of uncertainty in 
the fit. iKunz et ^ (1200711 presented a powerful Bayesian 
technique for simultaneously modeling the distributions 
of normal SNe and outliers, but does not confront many 
complexities of the data, including the luminosity stan¬ 
dardizations and selection effects. 

Selection effects, the tendency for surveys to select 
against the faintest SNe, sculpt the observed distribu¬ 
tion of SNe (this is fr equently referred to as Malmquist 
bias. lMalmauistiri922ll . If not taken into account, this se¬ 
lection will bias both the cosmological parameter estima¬ 
tion and the standardization coefficients. There are SN- 
to-SN variations in the selection efficiency, even within 
the same survey and for the same redshift (e.g., due to 
seeing, host-galaxy contamination, or moonlight). These 
are deterministic but difficult-to-model sources of vari¬ 
ation. Noise also plays a role, as a SN right at the 
detection threshold may stochastically scatter above or 
below it. Thus, the detection efficiency transitions to 
zero smoothly as a function of apparent brightness. In 
most SN cosmological analyses (based on Equation [2]) , 
the treatment of selection effects is performed outside 
the statistical framework, in the form of survey simula¬ 
tions followed by an ad-hoc redshift-dependent adjust¬ 
ment of the to approximately cancel th e estimated 
bias (|Kessler et al.ll20d^ iConlev et al.ll2011h . Bayesian 
analyses face a related challenge: selection effects also 
influence the distribution of light-curve width and color 
with reds hift, which can amplify selection effects if not 
modeled (|Wood-Vasev et aP 1200711 . However, these fre- 
quentist and Bayesian analyses both require knowledge 
of the true population distribution (as a function of red¬ 
shift) to get accurate results. 

The xi and c standardizations in Equation [1] are lin¬ 
ear. However, nonlinear decline-rate and color stan- 
dardizations are statistically justified ([Amanullah et all 
l20ICt l^c olnic et al.l l20I4allH l F^ Our Union2 result 
( Amanullah et al.l l201(T[l ' was derived from subdividing 
the sample by the best-fit latent variables (see Sec¬ 
tion Mil- The subdivisions showed very similar cosmo¬ 
logical results, but subdivision tests are statistically weak 
(statistical uncertainties on the difference are ^ 2 times 
larger than the uncertainty on the whole sample). In¬ 
cluding these nonlinear standardizations in the fit would 
be preferable for evaluating their impact and would re¬ 
move any bias created by the assumption of line arity. 
Un fortunately, neither th e lAmanullah et al.l (|20I0f) , nor 
the iScolnic et al.l (|20I4bll frameworks were able to incor¬ 
porate these nonlinear standardizations in a fully self- 
consistent way. 

Type la Hubble diagrams show more dispersion in dis¬ 
tance than can be explained with measurement uncer¬ 
tainty alone. As noted above, the framework must in- 

Others have seen a trend towards lower c olor-standardization 
coefficients for ev en redder SNe, c > 0.5 IIMandel et al.l 120111 : 
IHurns et al.l 120131 . but virtually all the SNe used for cosmology 
are bluer than this. 
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elude a model of the unexplained dispersion, and should 
not assume that all of this dispersion is in the inde- 

2 

pendent variable (magnitude). In other words, 
is really (a, /?)E3 SN variation (beyond light- 


curve width and color), is presently only crudely mod¬ 
eled by current light- curve fitters and e s pecially affects 
the measured color (iWang et al.l 120091: IChotard et ahl 


2011HFolev & Kasen 

l2miHFolev et a,l.ll2nnHFolevll2m 21: 

Saunders et al.l 12015 

), resulting in color measurements 


that mix extinction and SN differences in only partially 
understood ways. When using Equation [21 there is no 
way to fit for the unexplained dispersion simultaneously 
with the other parameters. Improvement is needed here, 
as the unexplained dispersion interacts with the mod¬ 
eling of selection effects, the outlier rejection, and fit¬ 
ting a and /?. Currently, we are constrained to use 
ad-hoc methods, such as performing many fits with a 
randomized unexplained-dispersion matrix, and comput¬ 
ing the distributions of cosmological parameters fit to 
fit ([Marriner et al.l 1201 It) . The impact of these varia- 
tions is currently subdominant to the to tal uncertainty 
([Kessler et al.1 1201 3t iMosher et al.1120141) , but this may 
not be true in future samples and a precise way to evalu¬ 
ate th e cosmological impact is desirable. iMarch et al.l 
(l2tO took a step in the right direction by using a 
Bayesian hierarchical model to simultaneously fit for cos¬ 
mological parameters and unexplained dispersion, but 
their model cannot accommodate calibration uncertain¬ 
ties, outliers, nonlinear standardizations, or selection ef¬ 
fects. 

Future SN cosmology analyses will confront additional 
difficulties, such as heterogeneity. Even very homoge¬ 
neous S N imaging surveys (such as the Dar k Energy 
Survey, lFlaughei]l2005t [Bernstein et al.ll2012D will gain 
heterogeneity as expensive observational resources, such 
as near-IR measurements or high-quality spectroscopy, 
will only be available for a subset of objects. A fre- 
quentist, “object-by-object” uncertainty analysis such as 
Equation [2| cannot make efficient use of this informa- 

tionH 

1.3. Desired Properties for a New Approach 

In light of these problems, this paper offers an im¬ 
proved technique incorporating a more sophisticated, 
Bayesian model of the data. Properly making use of het¬ 
erogeneous information (like measurements that are only 
available for a subset of objects) requires a model of the 
SN population in which the parameters of the distribu¬ 
tion are treated as unknowns. In addition, the model of 
the unexplained dispersion should allow for uncertainty 
in both size and functional form. We discuss the details 
of our procedure for parameterizing these possibilities in 
Section 12.71 Only a Bayesian framework can accommo¬ 
date more exotic possibilities, such as very large numbers 
of nuisance parameters (each possibly non-zero and hav- 

The SALT2 light-curve fitter already incorporates a signifi¬ 
cant amount of model uncertainty in both the color and magnitude 
measurements. 

The Union2.1 analysis IlSuzuki et al.l 1201211 already uses a 
simple Bayesian model for the host-galaxy mass standardization, 
where samples which lack host-mass measurements are given pri¬ 
ors derived from other similar samples, but the remainder of that 
analysis uses a frequentist framework. 


ing initially unknown size), as it can make use of a hierar¬ 
chical prior around zeroE3 One could imagine many pos¬ 
sibilities for these parameters (e.g., color-standardization 
coefficient, /3, as a function of host-galaxy-inclination an¬ 
gle) ; we do not pursue these in this work, but we do note 
that they could be built into the general framework. 

However, not all of the improvements we propose re¬ 
quire Bayesian statistics; some are due to the improve¬ 
ments to the data likelihood (and could thus fit into 
a frequentist framework). First, we introduce a mod¬ 
ified approach to fitting for residual correlations with 
host-galaxy mass, described in Section 12.21 Second, like 
iKunz et (|2007D . we handle outliers with a mixture 
model, described in Section 12.31 which offers improved 
robustness. Third, to account for selection effects, our 
proposed framework uses a probabilistic model of selec¬ 
tion as described in Section [2^ modifying the classical 
likelihood to include selection directly. This cleanly esti¬ 
mates and marginalizes the hyperparameters of the true 
population distributions simultaneously with other pa¬ 
rameters, propagating all uncertainties. Fourth, in Sec¬ 
tion [531 we discuss our approach to fitting for the chang¬ 
ing SN independent variable distribution with redshift 
(population drift) simultaneously with other parameters. 
Finally, our new framework can accommodate nonlinear 
standardizations (Section [23]) . A frequentist framework 
can, in principle, include nonlinear standardizations, but 
it is computationally challengingF^ 

Both the old and new techniques can handle corre¬ 
lations due to systematic uncertainties (such as corre¬ 
lated peculiar velocities and photometric calibration un¬ 
certainties), assuming the sizes of these uncertainties are 
knownThe modifications to t he old framework to in ¬ 
clude correlations are d e tailed iniKowalski et al.1 (I2008D: 
lAmanullah et al.l (|201(T[) : [Guv et al.l (|20ir)( ): lConlev et al l 
(I2011D . We implement these correlations with nuisance 
parameters, as discussed in SectionThe model must 
also allow for population drift (whether due to selection 
effects, or changes in the SN population with redshift), or 
risk a significant bias on cosmological parameters. Pop¬ 
ulation drift can also be handled well with either frame¬ 
work, as Equation [2] handles each object independently 
(if the measurements are uncorrelated object to object) 
and our new framework solves for population drift simul¬ 
taneously with other parameters (Section [2^. 

There are two disadvantages to our approach. First, 
the more sophisticated model requires more CPU time 
(now measured in many hours, rather than minutes). 
Second, it is not necessarily possible to generate a unique 

In that kind of Bayesian hierarchal model, the width of the 
prior around zero is incorporated as a hyperparameter in the fit. 
The prior will smoothly weaken as the statistical evidence builds 
that some of these nuisance parameters are non-zero. If there is no 
evidence that any of these parameters are non-zero, the prior will 
remain clustered around zero. 

Note that the likelihoods for the latent variables can easily 
become multi-modal for nonlinear standardizations, as a curved 
line may approach a datapoint more than once. For this reason, 
it is more practical to approach the problem with a technique like 
MCMC than a derivative-based minimizer. The lead author has 
had success in a frequentist framework only by testing each local 
minimum for each latent variable and recording the best one. 

There is a benefit to a Bayesian approach if the systematic un¬ 
certainties have unknown sizes, and these sizes must be marginal¬ 
ized over. 
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distance modulus for each SN as in Equation [TJ The 
true-independent-variable priors (Section 12.51) must be 
treated in bins smaller than the other parameters of in¬ 
terest (e.g., bins of < 0.1 in redshift for cosmological 
parameters). However, no one set of assumptions will 
exactly cover all use cases (e.g., tests for isotropy). Ap¬ 
proximate sets of SN distances are possible, but we leave 
this for future work. 

2. PROPOSED METHOD 

Our proposed framework surpasses previous analysis 
efforts by bringing together components that simultane¬ 
ously address each of the limitations discussed above. 
We call our framework UNITY (Unified Nonlinear In¬ 
ference for Type-la cosmologY). The required model 
needs to contain thousands of nonlinear fit parameters 
(as motivated in the following sections), which poses 
a problem for many statistical techniques. For exam¬ 
ple, there is no practical way to analytically marginal¬ 
ize over every parameter except the parameter (s) of in¬ 
terest. We thus must draw random samples from the 
posterior distribution (Monte Carlo sampling), and use 
those samples to estimate the posteriors. A natural tool 
for this sampling is Hamiltonian Monte Carlo (HMC). 
HMC samples efficiently, with short correlation lengths 
sample to sample, even for large n umbers of fit param¬ 
eters . To t his end, we use Stan (|Hoffman fc GelmanI 
1201 IL 12014 iBetancourd I2013D through the PyStan in- 
terfac ^Stan Development Team! 12014) . Stan automat¬ 
ically chooses a mass matrix, and speeds up sampling 
efficiency even further by using a variant of HMC sam¬ 
pling called the “no-U-turn” sampler. Stan also incorpo¬ 
rates automatic, analytic differentiation for computing 
the gradient of the log-likelihood, making the implemen¬ 
tation of the model simple and readable. We show a 
Probabilistic Craphical Model of our framework in Fig¬ 
ure [H and show a table of parameters in Table [T] 

2.1. Parameterization of the True Position on the 
Standardization Relation 

innii (fi^ offers an excellent discussion of linear re¬ 
gression with error bars in both dependent and indepen¬ 
dent variables. We briefly summarize here. Consider the 
case of htting a straight line (slope A, intercept B) in two 
dimensions {y, the dependent variable vs x, the indepen¬ 
dent variable) with uncertainties in both x and y {jTx and 
<7y^ assumed uncorrelated and Gaussian for simplicity). 


case there are (Number of Observations) x (Number of In¬ 
dependent Variables) latent variables. For the standard 
Xi and c standardizations, this results in 2VsNe addi¬ 
tional parameters FI 

In a frequentist analysis, if measurement uncertain¬ 
ties are believed to be Gaussian and the xi and c stan¬ 
dardizations are linear, the likelihood can be analytically 
maxi mized for each of these parameters (iKowalski et al.l 
1200811 . This technique enables the proper handling of 
these nuisance parameters without explicitly including 
them in the fit. Similarly, in a Bayesian analysis, if the 
measurement uncertainties are Gaussian (or a Gaussian 
mixture), the standardizations are linear, and the priors 
on these parameters are flat or Gaussian (or a Gaussian 
mixture), thes e nuisance parameters can be analytically 
marginalized ()Gulll[T98^ . providing a similar computa - 
tional efficiency boost (as was done in iMarch et al.l[2011ll . 
In this work, we violate these assumptions; therefore, we 
must keep these 2VsNe nuisance parameters in the fit 
(xf"® and cf"®). 

2.2. Host-Galaxy Environment Standardization 
Relation 

iKellv et al.l 1)2010(1 first found evidence that Hubble 
residuals in current light-curve fitters are correlated with 
host-galaxy environment. This finding lat er reached high 
statistical significance in larger samples (|Sulhva n et al.l 
l2010t iLamneitl et ahll^OlOt iGhildress et alJ 1201,1(1 . The 
current method for including this effect in cosmolog¬ 
ical analyses fits two separate absolute magnitudes 
for low-mass-ho sted (M^, < 10^°Mr:^l and high-mass - 
hosted galaxies (ISulhvan et al.ll2010l : iSuzuki et al.ll20l3l . 
iRjgault et al.l (|201.'1[ 1 found evidence that most of this ef¬ 
fect is due to the ag e of the progenitor sy stem (see also 
indirect evidence in IGhildress et al.l 120131). Th is effect, 
confirmed independently bv iRigault et al.l(|2015ll , implies 
that SNe hosted by high-stellar-mass galaxies may be¬ 
come more like low-stellar-mass-hosted SNe at higher 
redshift (when progenitors were young in all galaxies). 
However, newer versions of SALT2 combined with dif¬ 
ferent sample selections may not show as st rong an age 
effect (jRigault et al.ll20l1 : 1 Jones et al.ll2015ll . 

We therefore do not assume a constant mass- 
standardizati on (S)\ instead we us e a modified version of 
th e model in IRjgault et al.l (1201,80 (similar to the model 
of IGhildress et al.ll2O140 


The generative model for an observed y (“y°bs”^ jg 


= S{z) 



yObs 

^Ar(y*™®, al) 

(3) 

= m 

(l s(o)) d{oo) 

• (5) 

and similarly for 



0.9 -f 100-95^ (5(0) 

x°'^^ - 

^Ar(x‘™®, al), 

(4) 

Those authors’ proposed host-mass-standardization 


where x*“'“® and are the value for the measurement if 
no uncertainty is present, and y*™® = B. When 

there is no significant uncertainty in x, Equation|4]is un¬ 
necessary, and we have a simple least-squares regression. 
When uncertainty is present in both variables, we can 
substitute for y*’’”®, but a:*"'“® remains. For a fit with two 
independent variables, there are two latent variables rep¬ 
resenting the true position on the line (now a plane). The 
same logic holds for more than one observation, in which 


evolution predicts the mass-standardization coefficient 
will approach zero at high-redshift; we instead assume it 
smoothly approaches a possibly non-zero quantity, (5(oo). 
We take a flat prior on ^(oo)/(5(0) from 0 to 1, allowing 

If the redshifts of the SNe are also unknown, the analysis 
will instead face SA^sNe parameters. Because of the potential of 
misassociation, this situation might be faced by surveys measuring 
many host-galaxy redshifts after the SNe have faded, such as the 
Dark Energy Survey. 
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Fig. 1. — Probabilistic Graphical Model of our framework showing the causal links. An edge from one node (e.g., flm) to another (e.g., 
means that the latter is conditional upon the former (e.g., is conditional on flm)- The enclosed nodes represent variables that 

are sampled in the MCMC. Global parameters are in orange nodes (single parameters) and red nodes (the set of systematic uncertainty 
parameters). Green nodes enclose the hyperparameters (parameters of a prior distribution) of the latent-variable priors, and the singly 
outlined black nodes show those latent variables. Blue nodes show sample-dependent quantities. Finally, the outlined nodes show the 
observed light-curve fits. (Each of depends on {m^l^®, c**'^®} as the light-curve fit and unexplained dispersion 

have correlated uncertainties.) i ranges over each SN, j ranges over each SN sample, k ranges over the coefficients in redshift within a sample, 
and I ranges over each systematic uncertainty (e.g., calibration). Note that the are completely determined by other parameters and 

are not true fit parameters. We fix the selection effect parameters, and the outlier distribution width so these 

are represented filled nodes. 


TABLE 1 

Parameters in our model. 


Parameters 

Symbols 

Section 

Absolute Magnitude for h = 0.7 

Mb 

o 

Cosmological Model (Flat ACDM) 


o 

Latent Variables 

r,..true ^true 

o 

Host-Mass Standardization Coefficients 

<5(0), <5(oo) 

[T21 

Outlier Distribution 

^outl ^outl 

[23] 

Sample Limiting Magnitudes (fixed) 

^cut rtCUt ^cut 

j ’ j ’ j ’ j 

El 

Latent-Variable Hyperparameters 

d®, /3^, a®, 

El 

Light-Curve Color and Shape Standardization Coefficients 

ED 

‘Sample” (Unexplained) Dispersion 

^samp pSamp 

ED 

Systematic Uncertainties 

AsyS; 

12.81 


Note. — The final column displays section references for the parameters in our model. 


the mass standardization to be constant or declining with 
redshift, and spanning all of the claims in the literature. 

2.3. Non-la or Other Outlier Contamination 

For non-la contamination, we use the mixture-model 
framework of iKnnz et al.l (j2007[l . This framework mod¬ 
els the observed distribution around the modeled mean 
as a sum of Gaussians, where at least one Gaussian (the 
normal la distribution) is tightly clustered, and an out¬ 
lying distribution is comparatively dispersed. Although 
the assumption of a Gaussian contaminating distribu¬ 


tion is a strong one, it makes little difference in practice. 
As the outlying distribution is broader than the inlying 
distribution, any outlying point will be treated as an out¬ 
lier. For the relatively pure spectroscopically confirmed 
SN datasets in use today, modeling the outlying distri¬ 
bution accurately has little impact on the rest of the 
parameters in the model. Because of this, and as a test 
of the framework, we perform our fits assuming an out¬ 
lier distribution that is a unit multinormal = 1) 

in {mB^ Xi^ c} (centered on the la distribution for that 
redshift), which is different from the simulated data we 
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generate to test the framework (Section [3|). 

The relative normalizations of the core and outlying 
distributions can be chosen object by object (from spec¬ 
troscopic or other classification evidence), or set to be the 
same for every object. We fit for the fraction of outliers 
assuming it is the same for all objects (/°“*^), and place 
a broad log-normal prior on this quantity of —3±0.5 (an 
outlier fraction of exp(—3) = 0.05 plus or minus 50%). 
These assumptions work well with Union2.1, as discussed 
in Section K7\ but of course they can be adjusted for 
other datasets. 

If the luminosity distribution of SNe la turns out to 
be signifi cantly non-Gaussian (for example, the bimodal 
model of iRigault et al.ll201^ . additional Gaussian com¬ 
ponents can be added (with redshift-dependent normal¬ 
izations) to give smaller uncertainties and capture pos¬ 
sible population drift. We leave this and more complex 
models of the non-la distribution for future work, but 
these extensions fit easily into this framework. 


2.4. Selection Effects 

We present the details of our selection model in Ap¬ 
pendix II but outline the important points here. The 
standard method for incorporating a selection cut is to 
truncate the data likelihoo d at the cut and d ivide by 
the selectio n efficiency le.g.. iGelman et aT]l2013L see also 
IKellvl[20^ for a discussion of selection effects and non¬ 
detections in the context of linear regression). In SN cos¬ 
mology, the truncation is not sharp, but is instead prob¬ 
abilistic, as discussed in Section lOl We assume that the 
observation likelihood is truncated by an error function. 
Far from the selection limit, a SN is found or missed 
with probability one or zero; for SNe near the selec¬ 
tion limit, the probability transitions smoothly. An error 
function reasonably matches the efficiency curves of e.g., 
iDild^ et_alJ_ ((20^ ■ iBarbarv et all ( 20ljl: |Perrett et a.lJ 
(|2012ll : IGraur et al.l (|20I4ll : IRodnev et alJ (1201411 FI 

Surveys also do not select only on one measured vari¬ 
able. We assume our cut is a plane in three-dimensional 
space, spanning the dependent variable and both inde¬ 
pendent variables. In our example, this is magnitude 
(SALT2 tob), shape (SALT2 xi), and color (SALT2 c); 
SNe with niB + -I- b™^c > are less than 50% 

likely to be found (and more than 50% likely to be found 
if < The width of the cut is SNe observed 

at ± are considered I6%(-|-) or 84%(—) likely 
to be found, tob and c are the primary variables respon¬ 
sible for selection effects in SN searches. For example, 
SNe found in the rest-frame R-band have a limit in ms 
^^cut _ while SNe selected in the rest-frame R-band 
have a limit in ttib — c ~ my or 5'^“* = —I. We note 
that for selection in just tub, bluer (more-negative c) 
and slower-declining SNe (larger xf) will be selected, as 
these correlate with brighter niB- That is, the only ef¬ 
fect that a magnitude-based {ttib, c} selection ignores is 
that slower-declining SNe are more likely to be found, 
irrespective of the maximum brightness, as they stay 
above the detection threshold longer. A simple simu¬ 
lation shows that this effect is very small compared to 


An error-function truncation rapidly approaches 100% effi¬ 
ciency on the bright side of the cut. Real surveys are not as ideal, 
and will asymptote short of 100%. As this asymptote has little 
brightness dependence, it does not impact the selected population. 


selection on {ttib, c}, even for cadences as large as ten 
rest-frame days. Another bias related to selection effects 
is the bias due to larg er uncertainties on fainter SNe (e.g., 
IKowalski et al.l[2008f) . Simple simulations show that our 
Bayesian framework has much less susceptibility to this 
bias, and the uncertainty bias is much smaller than the 
one due to missing faint SNe altogether. 

2.5. Priors on True Values of Independent Variables 

As this is a Bayesian framework, we must s elect priors 
on the true xi and c latent parameters Isee iGull 119891 
for a discussion of Gaussian priors, and iKellvl 120071 for 
a Gaussian-inixture prior). These priors must be chosen 
very carefully. If the prior mean is wrong, then every dis¬ 
tance will also be incorrect in a correlated direction. The 
variance of the prior has an impact as well. If the prior 
variance is larger than the population variance, then the 
true latent parameters will be scattered about the mean 
more than they should, and the slope of the line will be 
biased towards zero. The converse will bias the slope of 
the lines away from zero. The mean and variance of the 
prior are the most important parameters to estimate ac¬ 
curately, thus Gaussians are normally adequate. In SN 
cosmology, these priors must also be redshift-dependent 
as the SN population can drift with redshift. 

The optimal way to ensure the proper size and redshift- 
dependence of the priors is to fit for the prior parame¬ 
ters (the “hyperparameters”) simultaneously with every 
other parameter. We selected skew normal distributions 
for the c priors (allowing the distribution to be skewed), 
and Gaussians for the xi priors. The prior must be able 
to vary in redshift more rapidly than the cosmological 
fit in order to not introduce a bias. What we propose 
here more than meets this mild requirement F3 but there 
is no harm in allowing the hyperparameters to mimic 
more closely the redshift dependence of the xi and c 
distributions. For each sample, we fit for the mean of 
the distributions as a function of redshift with a linear 
spline. We use up to four spline nodes (the x*j/^ and 
c*/., for xi and c, respectively), equally spaced in red¬ 
shift over the range of a sample, with linear interpola¬ 
tion between these nodesF3 We take non-informative 
flat priors on the means of the distributions and on the 
log of the standard deviations (the standard deviations 
for each sample j are and For the shape pa¬ 
rameter of the skew normals, we take a flat prior 

on a/1 + (q;®“^)^, which is also allowed to 

vary in redshift in the same way as the distribution mean. 
This prior forces the skew normal to approach a Gaussian 
for samples with few objects. (The superscript “S—N” 
here is used to distinguish these skew-normal variables 
from the SN standardization coefficients.) 

For simplicity, we assume that the true xi and c distri¬ 
butions are uncorrelated (the observed distributions of xi 

In contrast, the constant-in-redshift priors of (March et al.l 
1(20111 ) do not meet this requirement. 

Specifically, we use two nodes (a single line segment) for 
datasets with fewer than 30 SNe; we use three for 30-39 and four 
when the number of SNe is at least 40. This improves the robust¬ 
ness of the determination of the independent variable standard 
deviations. 

For a skew normal distribution, the mean and standard devi¬ 
ation are not the same quantities as /r and a. 
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and c show little correlation). We do note that ignoring 
correlations will bias the fit if any signihcant (jpl > 0.2) 
correlations are present. 

2.6. Nonlinear Independent Variable standardizations 

With the true values of the independent variables ex¬ 
plicit in the model, it becomes trivial to have nonlinear 
standardizations. For this work, we suggest a broken- 
linear relationship, allowing red/blue and smalh/large- 
Xi SNe to have different size standardization coefficients 
and ja^, respectively). We take a flat prior 
on the angle of each line segment, but transform to the 
average slope and the difference in slopes for display pur¬ 
poses: a = (a^-|-a‘^)/2, da = a^ — a°, /3 = (/?^-|-/3^)/2, 
and dp = — P^. Although the xi and c values of the 

break could be fit parameters, we do not do this for our 
primary fits. For the moment, we split the sample at xi 
and c of 0. 


and correlated peculiar velocities. In standard analy- 
ses, these effects are pro p agated into a cov a riance matrix 
(lAmanullah et all 120101 : I Guv et ahl 120101 : iConlev et al.l 

In order to speed up the Monte-Carlo sampling, we 
leave e ach correlating factor explicit as a parameter (sim¬ 
ilar to lKowalski et al.ll2008ll . Asys; (where I ranges over 
all systematic uncertainties), while leaving the data un¬ 
correlated SN to SN. These two approaches coincide 
exactly for a linear mo del with Gaussian uncertainties 
(|Amanullah et all I2010f) . The Asys; parameters cap¬ 
ture the deviations of a measured quantity, like a zero- 
point or a filter bandpass, from the estimated value. For 
each quantity, we numerically compute the derivative of 
the light-curve fit with respect to that 

quantity, giving each dm'j^l/dAsySi, dx°^^/dAsySi, and 
dc°^^/dAsySi. This lets us marginalize out Asys;, with a 
Gaussian prior around zero set by the estimated size of 
each systematic uncertainty. 


2.7. Unexplained Dispersion 

As the unexplained dispersion is parameterized in the 
model, it can be marginalized. We do not know what 
functional form to assume, so we use a flexible param¬ 
eterization. Each SN sample is allowed its own unex¬ 
plained dispersion, allowing poorer-quality samples to be 
naturally deweighted. We must also distribute the unex¬ 
plained dispersion over ms, xi, and c, while accounting 
for possible correlations. 

First, we split the variance of the unexplained disper¬ 
sion into and /‘^ (the fraction of the unex¬ 

plained variance in tub, xi, and c, respectively), which 
are constrained to sum to one. Then, we scale each 
of these by 1, 0.13“^, and (—3.0)“^. The values 0.13 
and —3.0 approximately scale out a and —/3, respec¬ 
tively, where the negative sign for P corresponds to the 
sign convention in Equation [TJ (Note that using a and 
—/? directly would cancel the a or P dependence when 
computing a marginalized distance uncertainty for each 
SN, so this would be inappropriate.) We also scale the 
variance by the unexplained dispersion for each sam¬ 
ple, Finally, we form a covariance matrix out 

of this {nriB, Xi, c} unexplained variance, allowing the 
off-diagonals to be scaled by parameters as fol¬ 

lows 


/ 

samp 

el2 0.13 

samp f ° 

\ Pl3 -3.0 


samp 

^12 0.13 

/"i 


07 3 ^ 

samp / f^i £c 
^23 (-3.0)(0.13) 


samp /<= \ 

ei3 -3.0 \ 

samp samp 

f^23 (-3.0)(0.13) U 

r / 

(-3.0)^ / 


( 6 ) 

We take a non-i nformative “LKJ” prior o n the correla¬ 
tion distribution (|Lewandowski et ^1200911 . with rj = 1, 
as well as a flat prior on logcr®™^. 


2.8. Treatment of Correlated SN Observations 

Many effects result in correlated measurements in SN 
cosmology. The most notable such effect results from 
common calibration paths: SNe from a given dataset 
share systematics such as telescope bandpass and pho¬ 
tometric calibration uncertainties. Other effects include 
correlated uncertainties in Milky Way extinction maps. 


3. SIMULATED DATA TESTING 

Our analysis framework must pass through careful test¬ 
ing using simulated data before it can be applied to real 
data. To this end, we generate thirty simulated datasets 
that incorporate many characteristics of real data. As 
our analysis takes the SALT2 light-curve fits as inputs, 
this is the level at which we generate simulated data. 

• We generate four simulated datasets spanning the 
redshift ranges 0.02-0.05, 0.05-0.4, 0.2-1.0, and 
0.7-1.4. 

• Each simulated dataset has 250 SNe, except the 
highest-redshift, which has 50. 

• We generate the xi population from a unit normal 
distribution, centered on zero. 

• We draw the population c values from the sum of 
a Gaussian distribution of width 0.1 magnitudes, 
and an exponential with rate 1/(0.1 magnitudes). 
We center the distribution on zero. 

• We assume that the unexplained dispersion covari¬ 
ance matrix is correct in SALT2, and that only 
dispersion in rriB (gray dispersion) remains. The 
statistical model does not have access to this in¬ 
formation, and fits for the full unknown matrix, 

2 overestimating the uncertainties on xi and c, and 
thus slightly biasing a and P away from zero (see 
Section [2T5|) . (This is not a unique problem for our 
framework; the old technique would have the same 
bias.) 

• We assume that the uncertainties on m^, xi, and 
c are 0.05, 0.5, and 0.05, and are uncorrelated. In 
addition, we take 0.1 magnitudes of unexplained 
dispersion, O.OOSz ma gnitudes of lensing dispersion 
(|Holz fc Linded[2005ll . and 300 km/s peculiar ve¬ 
locity uncertainty. All are approximated as Gaus¬ 
sian and independent SN to SN. 

• a and P are assumed to be constant, with values 
0.13 and 3.0, respectively. Mb is set to —19.1 and 

is set to 0.3 (flat ACDM model). 
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TABLE 2 

Summary for primary simulated data runs. 


Parameter 

Input Value 

Fitted Value 
± Uncertainty in Mean 

a 

0.130 

0.143 ± 0.004 

d 

3.000 

3.076 ± 0.016 

Mb 

-19.100 

-19.117 ± 0.003 


0.300 

0.298 ± 0.005 


Note. — Average over thirty simulated datasets. 
These results show an expected bias towards larger values 
of a and ^ (discussed in Section[^, but shows no sig¬ 
nificant bias. Other simulated data fits are also described 
in Section!^ 


• For the host-mass relation, we always take (5(0) 
to be 0.08 magnitudes, and select (5(oo)/(5(0) uni¬ 
formly from the range 0 to 1. 

• We assume 3% of the SNe are outliers, and draw 
their observed distribution centered around the 
normal la for that redshift, and around zero 
in xi and c. The spread is 1, 2, and 0.5 in tob, xi, 
and c respectively, and we assume these distribu¬ 
tions are Gaussian and uncorrelated. 

• Each sample has zeropoint uncertainties of size 
0.01, 0.01, 0.01, and 0.02 (highest-redshift sample) 
magnitudes. The uncertainties are taken to be in¬ 
dependent sample to sample. 

• The datasets have selection effects in tub, with 
width 0.2 magnitudes. The selection cuts are cho¬ 
sen for 50% completeness at redshifts 0.08, 0.25, 
0.6, and 1.45. Note that this selects from the 
population Xi and c distributions in a redshift- 
dependent way. (We randomly draw from the pop¬ 
ulation distributions and pass them through the 
simulated selection effects until the required num¬ 
ber of SNe are generated.) 

• We assume the redshift distribution of SNe scales 
linearly with redshift (starting from the minimum 
redshift of each sample). This is quantitatively in¬ 
correct (the real scaling will depend on the cos¬ 
mological volume, cosmological time dilation, and 
the SN rate), but does produce SN samples where 
many SNe are removed by the magnitude cuts, al¬ 
lowing a good test of our selection effect modeling. 

The redshift distribution of the four simulated datasets 
are shown in Figure[2]before selection effects (gray-tinted 
histograms) and as observed. We show the observed color 
distributions in Figure [H This figure shows outliers in 
gray (knowledge that the statistical framework does not 
have access to), and the general trend towards bluer SNe 
at higher redshift due to selection effects. Figure|l]shows 
the Hubble diagram residuals from the input cosmology 
with the best-fit ignoring selection effects (O^ = 0.32) 
also shown. 

After generating thirty complete sets (with all four SN 
samples), we supply the generated SALT2 result files to 
the framework. Table [5] summarizes the key results. As 
expected, o;, j3 show bias away from zero, but any bias 
on is small. 



Redshift 


Fig. 2.— Typical simulated redshift distribution for the four sim¬ 
ulated samples in each dataset. The gray-tinted histograms show 
the population before selection effects. The middle two samples 
are essentially magnitude-limited. 



Fig. 3.— Typical simulated observed color distribution for the 
four simulated samples in each dataset. The gray points are out¬ 
liers. 


We also run some simulated datasets fitting for both 
Sim and the Dark Energy equation of state parameter 
w (assumed to be constant in redshift) E3 We see no 
evidence of bias on w, and mild evidence of bias on the 
mean Sim: 0.274 ±0.015. 

4. REAL DATA DEMONSTRATION 

In this penultimate section, we demonstrate our frame- 
work on real data , namely, the Union2.1 compilation 
(jSuziiki et al.ll201^ . This compilation is a useful dataset 
for demonstrating the impact of the more-sophisticated 
analysis, as Union2.1 provides light-curve f its for outliers 
(the newer Joint Light-Curve Analysis, iBetoule et al.l 

Arguably, the correct priors to use for this model are flat pri- 
ors on kinematic cosmological quantities like qo and jo; these priors 
would better preserve the Gaussian SN likelihood. The cosmologi¬ 
cal results are similar with flat priors on Qrn and w, so we use flat 
priors on these parameters for simplicity. We constrain ftrn to be 
between 0 and 1, and w to be between —2 and 0. 
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Fig. 4.— Mean simulated-data Hubble-diagram residual from 
= 0.30 (in the sense of the numerator of Equation 0 for a 
large number of non-outlier points, plotted against redshift. The 
effect of the magnitude limits is clearly visible, and we overplot 
= 0.32-the best-fit ignoring these selection effects. 

12014 did not publish these SNe). Our cosmological 
hts include flm and assume a flat universe. This fit 
is qualitatively similar to the assumption of a constant 
equation-of-state parameter w including a CMB or BAO 
constraint, in that both fits probe the deceleration pa¬ 
rameter qq. As fitting ftm only requires SN data, it is a 
cleaner analysis for our purposes here. 

In order to identify the effect of each feature of the 
analysis on the inferred results, we incrementally transi¬ 
tion from the original Union2.1 frequentist framework to 
the analysis proposed in this work. The results of each 
step are shown in Figure O We conducted this part of 
the analysis blinded, using real data only after the code 
was validated on simulated data. The initial version of 
UNITY required the unexplained dispersion in ms to 
be fixed, which the improved selection effect model now 
presented in Appendix [B] does not require. With the 
improvements in place, after a second round of blinding¬ 
unblinding, we found only a small change in 17^ between 
the two versions, 0.009 (0.2 ct). 

4.1. Frequentist Union2.1 Analysis 

First, we show the results of a freque ntist calcula¬ 
tion, based on the same assumptions as LSnzuki et al.l 
(I 201 I and using its 580 SNe (top line in Figure [5]) 
with SALT2 light-curve fitsEB All systematic uncertain¬ 
ties from the covarian c e mat rix are included. To repro¬ 
duce the LSnzuki et al.l (|20I2h results, we include only a 
redshift-independent host-mass standardization0 As a 
cross-check, we also try a hybrid frequentist/Bayesian 
model, in which the from Equation [2] is converted 

into a likelihood as e~^ and then Mb, a, /?, and S are 
marginalized ove r (this is similar to the method used in 
iKnop et al.ll200,lll . We obtain essentially the same results 
as a purely frequentist fit. 

Union2.1 uses the SALT2-1 version of SALT. 

As the lSuzuki et all II2012I) cosmology fits including system¬ 
atic uncertainties fixed a and /3 for computational efficiency, our 
new results are very slightly different: 0.001 in the Xlm confidence 
interval. 


§4.1 

Union2.i (F) 

§4.2 

Union2.1 (B) 

§4.3 

+ Redshift-Dependebt Mass Step (F) 

§4.4 

+ Redshift-Dependent Mass Step (B) 

§4.5 

+ 3D Unexplained; Dispersion (B) 

§4.6 

+ Non-Linear Corrections (B) 

§4.7 

+ Outlier Model (B) 


-•- 

§4.8 

+ Selection Model (B) 


0.24 0.28 0.32 0.36 

(flatACDM) 


Fig. 5.— Cosmological fit for each analysis. The frequentist con¬ 
fidence intervals show the best-fit (red squares) and the Ax^ = 1 
boundaries (red lines). The Bayesian credible intervals show the 
median of the posterior (black circles) and the 15.9 and 84.1 per¬ 
centiles (black line ranges). The left margin gives the section num¬ 
ber in the text in which each variant is discussed. 


4.2. Bayesian Model with Same Data 

For the next step, we keep all data the same, but tran¬ 
sition to a Bayesian model for the data (the credible in¬ 
tervals are shown as the second line in Figure [5]). This 
model includes the SN population terms (described in 
Section 12.51 and necessary for a Bayesian analysis) and 
the Union2.I systematic uncertainties, but does not in¬ 
clude our proposed treatment of selection effects, out¬ 
liers, multi-dimensional unexplained dispersion, or the 
redshift-dependent host-mass standardization. Other 
than the type of inference and these differences, this fit 
is identical to the first fit. The error bars shrink by 5%, 
but the central value changes very little. As the data 
are the same for this fit and the last one, the gain in 
statistical power comes from the ability of a Bayesian 
hierarchical model to make better use of heterogeneous 
informationF^ 

A Bayesian hierarchical model and a frequentist line fit will 
give exactly the same slope and intercept if: all uncertainties are 
known and Gaussian, every measurement has the same uncertain¬ 
ties (homoscedasticity), the Bayesian hyperpriors are Gaussian, fiat 
priors are taken on all other parameters, and the slope and inter¬ 
cept are evaluated in the Bayesian model at the hyperparameter 
posterior maximum. These methods will give different results if 
the measurement uncertainties are heteroscedastic. For example, 
if there is a dataset with shape and color measurements but no 
SN has both measurements. Equation 0 will have no constrain¬ 
ing power, but a Bayesian model will. The difference between the 
Bayesian and frequentist Union2.1 fits is thus a measure of how 
inhomogeneous the uncertainties are within a given redshift range 
for each SN sample. 
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4.3. Frequentist Redshift-Dependent Host-Mass 
Standardization 

Next, we return to the frequentist framework, but in¬ 
clude the redshift-dependent mass standardization as de- 
scri bed in Section|2.2lft hird line in Figure [5]). We remove 
the lSuzuki et al.l (120121) covariance matrix term that cor¬ 
responds to systematic uncertainty on the mass stan¬ 
dardization. The best fit shifts to a higher (brighter 
standardized magnitudes on average at high redshift) be¬ 
cause the high-host-mass half of the high-redshift SNe 
has less standardization to fainter magnitudes. This step 
can be conducted through frequentist or Bayesian infer¬ 
ence, and we show both analyses. 

4.4. Bayesian Model with Redshift- Varying Host-Mass 
Standardization 

In this next fit, we continue with the same data and 
model as the last subsection (redshift-dependent host- 
mass standardization), but again transition to Bayesian 
inference, including the population terms (fourth line of 
Figure [S|). The results are similar to the frequentist re¬ 
sults, but the Bayesian fit is more agnostic about the 
value of 6{qo) (essentially unconstrained), so the fit shifts 
less than the frequentist one to higher rim- 

4.5. Bayesian Model with Unexplained Dispersion 

Remaining in the Bayesian framework, our next ad¬ 
dition is the multi-dimensional unexplained dispersion. 
We first remove the existing Union2.1 unexplained dis¬ 
persion (which is only in ms). By effectively increas¬ 
ing the uncertainties on the color, we increase the color- 
standardization coefficient (3. Thus the color standard¬ 
ization now moves bluer (bluer due to selection effects) 
high-redshift SNe fainter, decreasing the fitted (the 
fourth line from the bottom in Figure [5]). 

4.6. Bayesian Model with Nonlinear Standardizations 

We now take advantage of our explicit and 
values to include nonlinear color standardizations param¬ 
eterized by da and dfd. We remove the Union2.1 covari¬ 
ance terms that describe color-standardization system- 
atics (between the multivariate unexplained dispersion 
and the nonlinear standardizations, these systematic un¬ 
certainties are likely much lower). The color standard¬ 
ization is now strongly nonlinear (discussed further in 
Section HU), with redder SNe requiring a larger coeffi¬ 
cient than bluer SNe. This moves the fitted flm (third 
line from the bottom in Figure [5]) back in the opposite 
direction from the previous step. 

4.7. Bayesian Model with Outliers 

Next, we include in the fit all twelve outlier SNe re¬ 
moved by the Union sigma clipping in Union2.1 (a new 
total of 592). Instead of excluding these, we add our mix¬ 
ture model for handling outlier SNe. We also remove the 
Union2.1 systematic uncertainties on outlier rejection. 
The results are quite similar to the previous step, in¬ 
dicating that the Union sigma-clipping worked well with 
the 2% contamination that was present (second line from 
the bottom in Figure [5]). 

4.8. Bayesian Model with Selection Effects 


Finally, we model selection effects. For simplicity, we 
approximate all selection as occurring in rest-frame B- 
band (a™* = 0, = 0), and leave a more detailed 

analysis to future workl^ For many of the most impor¬ 
tant samples in Union2.1, this is not a bad approxima- 
tion. For example, th e Sloan Digital Sky Survey SNe 
( Holtzman et al.ll2008D were selected in g, r, and i-band 
( Dildav et al.ll2008D . At redshift ^ 0.4 (the distant end 


of the survey), r-band corresponds to rest-frame U-band. 
The S upernova Legacy S urvey SNe were selected in i- 
band (|Perrett et akllMT^ . which matches rest-frame B- 
band for the highest-redshift SNe with small distance 
uncertainties {z ~ 0.7). 

For some surveys, rest-frame U-band selection is a 
poor approximation. Many of the nearby SNe were se¬ 
lected from unfiltered surveys (approximately rest-frame 
i?-band), but these mostly galaxy-targeted surveys have 
generally weak selection effects in m agnitude (discussed 
below). Some d i stant SN surveys (iTonrv e t al.l 120031 
Riess et ffil l2007t lAmamillah et al.l 120101 iSnzuki et 


2012t iRubin et al.l 120131 ) had selections that were dif¬ 


ferent (bluer) than rest-frame R-band. At least for the 
HST-discovered SNe, selection effects are small for most 
of the redshift range (also discussed more below). 

We estimate the selection effects for each sample as 
follows (summarized in Table H]). Nearby SNe are 
generally limited by spectroscopic followup, for exam- 
ple the ^ 1 8 .5 ma gnitude limit for CfA discussed in 
iHicken et al.l 1)20091) (although this is unfiltered, we ap¬ 
proximate it as R-band). We take this limit as typical. 
The Calan/Tololo su rvey (iH amnv et al.lll996allbl ) and the 
SCP nearby search (jKowalski et al.ll2008D extend out to 
higher r edshift; the limiting m agnitude in this case is 
i? ^ 19 (|Hamuv fc: Pintolll999D. We ta ke the magnitude 
limit for SDSS from Dildavet_aU (|2008D and the limit for 
SNLS from iPerrett et alT i 20121) . Together, these sam¬ 
ples make up most of the mid-redshift weight. For the 
other mid-redshift samples, we take a limit of i? = 24, 
judged from the approximate rolloff of the SN popula¬ 
tion in redshift. For the high-redshift ground-discovered 
samples, we assume the surveys were 50% complete at 
z = 1; this gives a limit of 23.8 in /-band. Finally, for the 
HST-d iscovered SNe, we take a limit from iBarbarv et all 
()2012t) of z-band ^ 25 (Vega). In all cases, we take 
the width of the selection to be ±0.5 magnitudes (i.e., 
a SN 0.5 magnitudes brighter than the mean cut has 
an 84% chance of being selected; a SN 0.5 magnitudes 
fainter has a 16% chance). As a cross-check, we coher¬ 
ently shift each estimated magnitude limit fainter by 0.5 
magnitudes, representing an extreme limit of how inac¬ 
curate our estimations are likely to be. The flm credible 
interval shifts by only 0.006. 

We remove the Union2.1 covariance matrix terms for 
Malmquist bias before computing the ht shown in the 
bottom line of Figure [5l The Dm credible interval shifts 
to lower Dm, as the distant SNe with significant selection 
effects are standardized fainter. The central value of this 
final fit closely matches the original Union2.1 result, but 
we see that our new, smaller (by 7% compared to the 
Union2.1 analysis) credible interval shows an increase in 
statistical power. It is worth reiterating that the list 


As the present paper is primarily focused on methods, a very 
detailed treatment of the data is beyond its scope. 
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TABLE 3 

Sample selection effects. 


Sample 

Mag Limit 

rriB Limit (/c-corrected at z) 

Nearby SNe 

R 18.5 ± 0.5 

18.6 z = 0.04 

Calan/Tololo 

R 19.0 ± 0.5 

19.3 z = 0.1 

SCP Nearby 

R 19.0 ± 0.5 

19.3 z = 0.1 

SUSS 

r 22.1 ± 0.5 (AB) 

22.6 z = 0.4 

SNLS 

i 24.3 ± 0.5 (AB) 

25 z = 1 

Other mid-redshift 

R 24 ± 0.5 

24.5 z = 0.8 

Jdigh-redshitt ground 

1 23.8 ± 0.5 

25.0 z = 1.0 

HST ACS 

z 25.0 ± 0.5 (Vega) 

26.1 z = 1.5 


Note. — Approximate magnitude limits for each sample. We use a SALT2 
c — 0 SN template to convert each magnitude limit to the limit in rest-frame 
i?-band magnitude at the specified redshift. 


of improvements to make was established before the re¬ 
sults were known; we did not set out to simply achieve 
a similar result to Union2.1. The scatter of the interme¬ 
diate results generally validates the size of the Union2.1 
systematics estimates for these effects that we can now 
properly include in the model. 

4.9. Other Parameters 

We now present the results for important nuisance pa¬ 
rameters and their relation to in the form of ID and 
2D credible regions. In both cases, our credible regions 
are derived by using Kernel Density Estimation with a 
Gaussian kernel on the MCMC sampleslHl then solving 
for the contour level that encloses 68.3% (inner shaded 
regions) and 95.4% (outer shaded regions) of the poste¬ 
rior 

Figure[6]shows the significant degeneracy between both 
host-mass standardization parameters and Dm, illustrat¬ 
ing that neither one should be neglected. The mean 
value of the estimated fraction of outliers is similar to the 
12/592 found by the Union sigma clipping. This param¬ 
eter has no significant degeneracy with any parameter 
for the spectroscopically confirmed datasets in Union2.1, 
confirming that outlier rejection is not a significant con¬ 
cern at this high level of purity. As an additional cross¬ 
check, instead of assuming the outlier distribution is cen¬ 
tered on the SN la distribution, we fit for it. Including 
six parameters in the model for the mean and dispersion 
in {rriB, a^i, c} (taken to be uncorrelated, and assuming 
a constant distribution with redshift) leaves the error bar 
unchanged but shifts the credible region by 0.009 in D^. 
If future versions of the UNITY framework are run on 
samples with larger contamination, a more flexible out¬ 
lier parameterization can be matc hed with the increa sed 
number of outliers in the fit fe.g.. iHlozek et ani2012fl . 

Figure [7] shows the degeneracy between the fraction 
of the unexplained variance in Xi and a, and similarly 
c//3. Including these unexplained dispersion parameters 
increases the uncertainty and the mean value for a and 
/3. (Here, a and /3 represent the mean standardization 

We draw ~3,00 0 samples from sixteen chains. The 
IGelman fc Rubiiil II1992I ') R statistics are < 1.01, indicating good 
convergence. 

Kernel Density Estimation gives biased results when the pos¬ 
terior samples lie against a parameter boundary, as the density 
will be smoothed including regions with zero density. For all our 
variables, we reflect the nearby samples through the minimum and 
maximum samples, creating a virtual dataset on the other side of 
such boundaries. 


coefficient.) The model prefers most of the unexplained 
dispersion in rriB and c, rather than xi. We also see sta¬ 
tistical evidence of non-zero da and dP (recall that da 
is the xi standardization coefficient for broad-light-curve 
SNe minus the coefficient for narrow SNe; dP is similarly 
defined for red minus blue SNe): da is —0.089lQ Q 4 g and 

dp is 2.26;° j^E3 Both have a correlation with al¬ 
though the correlation with dP is larger. We note, at 
least for this compilation of SNe and SALT2-1, that blue 
SNe still have a non-zero /3: 0.561 o.34j and red SNe have 
a ,5 of 2.83/jQ ) 7 . This value is significantly less than 
4.1 (the value expected if all reddening were due to the 
mean extinction law of the Milky Way diffuse interstellar 
medium) 0 

5. CONCLUSIONS 

In this work, we propose UNITY, a unified Bayesian 
model for handling outliers, selection effects, shape and 
color standardizations, unexplained dispersion, and het¬ 
erogeneous observations. We demonstrate the method 
with the Union2.1 SN compilation, and show that our 
method has 7% smaller uncertainties, but results that are 
consistent with the Union2.1 analysis. The advantages of 
UNITY will likely be even larger in upcoming datasets, 
as scarce followup resources will introduce heterogene¬ 
ity, and enlarged samples may reduce the cosmological- 
parameter statistical uncertainties below the size of our 
improvements. 

There are several future directions for this research. 
We could allow the unexplained dispersion covariance 
matrix to vary with other parameters (ii, c, redshift, 
or rest-frame wavelength range). We could decompose 
the unexplai ned dispersion i nto a sum of Gaussians as in 
the model of iRigault et al.l (1201311. We could use Ga us- 
sian process regression ( R.asmussen fc Williamsl[2006[ l to 
handle the redshift-dependent priors on Xi and c. Our 
model also lets us include more than one light-curve fit 
for each SN, with the covariance matrix between different 
light-curve fitters parameterized. We could even make 
use of the hierarchical model to constrain “exotic pos¬ 
sibilities,” as sketched in Section 11.31 All of these are 

This extreme difference in the color-standardization relation 
with color likely implies that SALT2 should be retrained with a 
color-dependent color law. 

Although we divide the broken-linear relation at c = 0 in our 
standard analysis, letting the division be a fit parameter yields a 
division at c = —0.016±0.018; the uncertainties on 0^ significantly 
increase. 
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Fig. 6.— Credible intervals and regions of the mass-standardization coefficient 5(0), its variation with redshift, 5(oo)/5(0), the estimated 
outlier fraction, Mb, and Qm- The dark shading is the 68.3% credible region; the lighter shading is 95.4%. The mass-standardization 
coefficients have a significant correlation with cosmological parameters, but the estimated outlier fraction does not. 

straightforward modifications of what we present here, 
but our proposed model is already superior to current 
SN cosmological analysis frameworks. We believe these 
concepts will also be useful for other applications. 
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Fig. 7.— Credible intervals and regions of the unexplained variance in {mg, xi, c}, the standardization coefficients, and the cosmological 
parameters. The dark shading is the 68.3% credible region; the lighter shading is 95.4%. 
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APPENDIX 

A. DETAILS OF THE FULL FRAMEWORK 

Our model is a two-component mixture model, where the outlier component is a Gaussian in niB, xi, and c, and 
the normal SNe la component is a Gaussian in niB, xi, and c, truncated by selection effects. The full likelihood for a 
single SN is given by the sum of the core distribution 
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P(detect|{mg)®,a::r^:Cr®}) and P(detect) are described in Equations IB2I and IBlOi respectively, mgr is not a free 
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parameter; instead it is completely determined by other parameters and is given by 

mgr = Mb - a(:rgr ) 4™" + > IO^^Mq) + ^(z„ cosmo) . 

Ajms, xi, c} are the contributions from the systematic uncertainty terms, e.g.. 
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The broken-linear slopes are defined as 


a{x 


true\ 
li ) 




and S(zi) is given by Equation [5] 

We take the following priors on all parameters: 



^true 

< 0 

a^, 

^true 

> 0 


„true 

< 0 

/3«, 

true 

> 0 


(A5) 

(A6) 


Mb -W(-20,-18) 
tan-^a^ ~W(-0.2,0.3) 
tan-^a^ -W(-0.2,0.3) 
tan-V^ -W(-1.4,1.4) 
tan“^ ~ W(—1,1.4) 

^(0) ~W(-0.2,0.3) 

(5(oo) ~W(0,1)5(0) 

U^^W(0,1) 

logio ~ ^(logio 0-05, logio 0-50) 

^ LKJ(l) 

xrr^AfK^.ART)^) 

~ SkewNormal(mean = c*^,, variance = (Rj) 
5|“^ ~ W(-0.995,0.995) 
logio-Rf -^W(-0.5,0.5) 
logio i?|-W(-1.5,-0.5) 

^ijk ~ Cauchy(0,1) 
c*). ~ Cauchy(0,0.3) 
log/°“*^ ~Af(-3,0.52) 


Absolute Magnitude for h = 0.7 
Xi Standardization Coefficient, a:)™® < 0 
xi Standardization Coefficient, x)™® > 0 
c Standardization Coefficient, cf'^® < 0 
c Standardization Coefficient, > 0 
Host-Mass Standardization Coefficient, z = 0 
Host-Mass Standardization Coefficient, z = oo 

flm (flat ACDM) 
“Sample” (Unexplained) Dispersion 
Fraction of Unexplained Dispersion in {mB,a;i,c} 
Correlation Matrix of Unexplained Dispersion 

Modeled Latent xi 

Of = Modeled Latent c 

Related to SkewNormal Shape Parameter 
Dispersion of Latent xi 
Dispersion of Latent c 
Mean of Latent Xi 
Mean of Latent c 
Fraction of Outliers 


B. DETAILS OF THE SELECTION-EFFECT MODEL 

In this section, we detail the approach we take for modeling selection effects in UNITY. We start with a general 
likelihood containing missing observations in the case when both the number of observed objects (7V°'^®) and the 
number of missed objects (iV™'®®) are exactly known: 
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As described in Section [2^ we assume that the efficiency smoothly varies with {m^j®, cci^®, c°'^®}: 
P(detect|{m°B')^x^^cf®}) = $ 


m 


/™obs I „cut™obs 


^cut^ob®) 
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(B2) 


where 4) is the Gaussian CDF. Of course, we do not know any of the parameters for each of the missing observations. 
P(detect) is thus marginalized over the entire distribution when it is not referencing a specific observation. 

We now make the counterintuitive approximation that the redshift of each missed SN is exactly equal to the redshift 
of a detected SN. This approximation is accurate because the SN s amp les have, on average, enough SNe that the 
redshift distribution is resonably sampled. We now refactor Equation [Bl] to consider each detected SN: 
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The combinatoric factor trivially becomes (1 + iV™'®®). To minimize the number of parameters in the cosmological 
fit, we now seek to marginalize over (1 + from 1 to oo. It is common (e.g., iGelman et al.l[20T^ lKellvll2007[ l to 

take a flat prior on log for reasons that will become clear in a moment, we force this prior to zero faster than 

this by multiplying this (1 + prior by a weak exponential decay 


P{1 + = 


exp(-e(iVf 


1 )) 


1 + 

where e is a small positive number. Thus, the marginalization becomes: 


(B4) 


oo 

n X] Xu"", cf "}|params)P(detect|{mg’", c°'""})(l - P(detect|z,))^‘”"'' exp (-e(iV“‘"" + 1)) 

i=l (1+Ar“i='') = l 

which is equal to the geometric series 

oo 

n X] -P({«T-'K"ia:^i^")C?'""}|params)P(detect|{m'^‘’",x?^",c°'""})((l - P(detect|z,)) exp exp (-e) 

i=l (1+Ar™i='') = l 

(B6) 

which sums to 
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jyobs 
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P({to'^ 7, , cf "}|params)P(detect|{m|jf, x^Jf, cf"}) 

exp (e) — (1 — P (detect I Zi)) 

x?^", cf "}|params)P(detect|{mg’", cf"}) 

e + P (detect I Zi) 


(B7) 


This expression highlights the benefit of this prior. If a flat prior on log (7V™“®) is assumed (e = 0), then the likelihood 
can be poorly behaved in regions of parameter space where the efficiency is poor (even if care is taken in the log 
likelihood to correctly handle the asymptotes of the log of the efficiency). Running UNITY on Union2.I with e either 0 
or O.OI makes virtually no difference, but Stan has difhculty with many of the simulated datasets (which we constructed 
with severe Mahnquist bias as a test, see Section (H) for e = 0, so we make e = 0.01 our default choice. 

We now need the efficiency as a function of the parameters. We derived the following expression by analytically 
computing the variance, then expanding to first order in x\ (this approximation is good to better than one percent in 
the variance for our distributions). For simplicity, we use the same expression for c, although the c distribution is skew- 
normal instead of normal. (Changing the assumed intrinsic color distribution to normal, rather than skew-normal, 
changes Um by only 0.001, so this approximation is not a significant concern.) 


=Cr^+al^, + {5{z.)/2f (B8) 

+ ((a - af‘) + {a- a""*) da xl - + {da 

j j y 7 ^ 47 ^ 

+ ((/3 + 6™*) P")2 + (/3 + 6™‘) d/3 c* PS/ - + (d/3 P")"X^ ■ 

j j y TT 47 r 

The first line contains the measurement and unexplained dispersion in m^, the dispersion of the selection cut, and 
the dispersion due to the host-galaxy relation. The second and third lines give the dispersion due to the xi — ms 
and c — niB relations, respectively. Using the same approximations, we derived an expression for the mean of the 
(unstandardized) magnitude distribution 

mean = Mb + tJ.{zi) + {bj'^^ + P) c* + -(““5“* + a) x^ - ^ ; (B9) 

in this case, our approximations are valid to a few hundredths of a magnitude. We then compute the selection efficiency 
assuming that the distribution of magnitudes is Gaussian 

P(detect|zi) = ^((to^* — mean)/-\/U’"®) . (BIO) 

This approximation is not completely valid, as there is a tail to faint magnitudes of red SNe, but for the central part 
of the distribution (which is the part that is cut into by severe Mahnquist bias), it is a good approximation. 













